122 research outputs found

    Learning to Reconstruct Shapes from Unseen Classes

    From a single image, humans are able to perceive the full 3D shape of an object by exploiting learned shape priors from everyday life. Contemporary single-image 3D reconstruction algorithms aim to solve this task in a similar fashion, but often end up with priors that are highly biased by training classes. Here we present an algorithm, Generalizable Reconstruction (GenRe), designed to capture more generic, class-agnostic shape priors. We achieve this with an inference network and training procedure that combine 2.5D representations of visible surfaces (depth and silhouette), spherical shape representations of both visible and non-visible surfaces, and 3D voxel-based representations, in a principled manner that exploits the causal structure of how 3D shapes give rise to 2D images. Experiments demonstrate that GenRe performs well on single-view shape reconstruction, and generalizes to diverse novel objects from categories not seen during training. Comment: NeurIPS 2018 (Oral). The first two authors contributed equally to this paper. Project page: http://genre.csail.mit.edu

    Pix3D: Dataset and Methods for Single-Image 3D Shape Modeling

    We study 3D shape modeling from a single image and make contributions to it in three aspects. First, we present Pix3D, a large-scale benchmark of diverse image-shape pairs with pixel-level 2D-3D alignment. Pix3D has wide applications in shape-related tasks including reconstruction, retrieval, viewpoint estimation, etc. Building such a large-scale dataset, however, is highly challenging; existing datasets either contain only synthetic data, or lack precise alignment between 2D images and 3D shapes, or only have a small number of images. Second, we calibrate the evaluation criteria for 3D shape reconstruction through behavioral studies, and use them to objectively and systematically benchmark cutting-edge reconstruction algorithms on Pix3D. Third, we design a novel model that simultaneously performs 3D reconstruction and pose estimation; our multi-task learning approach achieves state-of-the-art performance on both tasks. Comment: CVPR 2018. The first two authors contributed equally to this work. Project page: http://pix3d.csail.mit.ed

    Visual Object Networks: Image Generation with Disentangled 3D Representation

    Recent progress in deep generative models has led to tremendous breakthroughs in image generation. However, while existing models can synthesize photorealistic images, they lack an understanding of our underlying 3D world. We present a new generative model, Visual Object Networks (VON), synthesizing natural images of objects with a disentangled 3D representation. Inspired by classic graphics rendering pipelines, we unravel our image formation process into three conditionally independent factors (shape, viewpoint, and texture) and present an end-to-end adversarial learning framework that jointly models 3D shapes and 2D images. Our model first learns to synthesize 3D shapes that are indistinguishable from real shapes. It then renders the object's 2.5D sketches (i.e., silhouette and depth map) from its shape under a sampled viewpoint. Finally, it learns to add realistic texture to these 2.5D sketches to generate natural images. The VON not only generates images that are more realistic than state-of-the-art 2D image synthesis methods, but also enables many 3D operations such as changing the viewpoint of a generated image, editing of shape and texture, linear interpolation in texture and shape space, and transferring appearance across different objects and viewpoints. Comment: NeurIPS 2018. Code: https://github.com/junyanz/VON Website: http://von.csail.mit.edu
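
    The step that turns a 3D shape into 2.5D sketches can be illustrated with a small, non-learned example. The sketch below assumes a binary voxel grid viewed orthographically along a fixed z axis; the function name `project_to_sketch` and the nested-list layout are hypothetical, and the abstract does not specify how VON implements its projection under a sampled viewpoint, so this is not the authors' method.

```python
from typing import List, Optional, Tuple

# Hypothetical layout: voxels[z][y][x], where 1 marks an occupied cell.
Voxels = List[List[List[int]]]


def project_to_sketch(
    voxels: Voxels,
) -> Tuple[List[List[int]], List[List[Optional[int]]]]:
    """Orthographically project a binary voxel grid along the z axis.

    Returns (silhouette, depth). silhouette[y][x] is 1 if any voxel on the
    ray through pixel (x, y) is occupied. depth[y][x] is the z index of the
    first occupied voxel on that ray, or None for background pixels.
    """
    depth_size = len(voxels)
    height = len(voxels[0])
    width = len(voxels[0][0])
    silhouette = [[0] * width for _ in range(height)]
    depth: List[List[Optional[int]]] = [[None] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # March from the camera (z = 0) into the grid.
            for z in range(depth_size):
                if voxels[z][y][x]:
                    silhouette[y][x] = 1
                    depth[y][x] = z
                    break
    return silhouette, depth


# A 2x2x2 toy shape: the top-right column is hit first, the top-left second.
toy = [
    [[0, 1], [0, 0]],  # z = 0 (nearest slice)
    [[1, 1], [0, 0]],  # z = 1
]
toy_silhouette, toy_depth = project_to_sketch(toy)
```

    A learned pipeline would need a differentiable version of this projection so that gradients can flow from the image back to the shape; the hard `break` above is only meant to show what the two sketch channels contain.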

    Height Information Aided 3D Real-Time Large-Scale Underground User Positioning

    Because of the cost of inertial and visual navigation equipment and the lack of satellite navigation signals, these technologies cannot be used in large-scale underground mining environments. To solve this problem, this study proposes a large-scale underground 3D real-time positioning method with seam height assistance. The method uses ultrawide band positioning base stations as the core and combines them with seam height information to build a factor graph confidence transfer model to realise 3D positioning. The simulation results show that the proposed real-time method is superior to the existing algorithms in positioning accuracy and can meet the needs of large-scale underground users.
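
    Why height information helps can be shown with a much simpler estimator than the factor graph model named in the abstract. The sketch below assumes the user's height `z` is already known (for example from the seam height) and that each base station reports a range; it then solves for the horizontal position with linearized least squares. The function name, the anchor layout, and the choice of solver are assumptions for illustration, not the paper's algorithm.

```python
from typing import Sequence, Tuple

Anchor = Tuple[float, float, float]  # (x, y, z) of a base station


def locate_with_known_height(
    anchors: Sequence[Anchor], ranges: Sequence[float], z: float
) -> Tuple[float, float]:
    """Estimate (x, y) from anchor ranges when the height z is known."""
    if len(anchors) < 3 or len(anchors) != len(ranges):
        raise ValueError("need at least 3 anchors and one range per anchor")

    # Knowing z removes the vertical part of each range, leaving the
    # squared horizontal distance to every anchor.
    d_sq = [
        max(r * r - (z - az) ** 2, 0.0)
        for (_, _, az), r in zip(anchors, ranges)
    ]

    # Subtract the first anchor's circle equation from the others to get
    # linear equations a1*x + a2*y = b, then accumulate normal equations.
    x0, y0, _ = anchors[0]
    s11 = s12 = s22 = t1 = t2 = 0.0
    for (xi, yi, _), di_sq in zip(anchors[1:], d_sq[1:]):
        a1 = 2.0 * (xi - x0)
        a2 = 2.0 * (yi - y0)
        b = (xi * xi + yi * yi) - (x0 * x0 + y0 * y0) - di_sq + d_sq[0]
        s11 += a1 * a1
        s12 += a1 * a2
        s22 += a2 * a2
        t1 += a1 * b
        t2 += a2 * b

    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear in the horizontal plane")
    x = (s22 * t1 - s12 * t2) / det
    y = (s11 * t2 - s12 * t1) / det
    return x, y
```

    With the height fixed, the unknowns drop from three to two, so three base stations that are not collinear in the horizontal plane are enough for a unique solution in this simplified setting.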

    Time Reversal Aided Bidirectional OFDM Underwater Cooperative Communication Algorithm with the Same Frequency Transmission

    In an underwater acoustic channel, signal transmission may experience significant latency and attenuation that degrade the performance of underwater communication. Cooperative communication can mitigate this, but its spectrum efficiency is lower than that of traditional underwater communication. We therefore propose a time reversal aided bidirectional OFDM underwater cooperative communication algorithm. The algorithm allows all underwater sensor nodes to share the same uplink and downlink frequency simultaneously to improve the spectrum efficiency. Since same-frequency transmission produces larger intersymbol interference, we first adopt the time reversal method to reduce the multipath interference; we then use a self-information cancelation module to remove each node's own signal from the OFDM block, since that signal is known to the sensor node. In the simulations, we compare the proposed algorithm with existing underwater cooperative transmission algorithms in terms of bit error ratio, transmission rate, and computation. The results show that the proposed algorithm doubles the spectrum efficiency at the same bit error ratio and achieves a higher transmission rate than the other underwater communication methods.
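
    The effect of time reversal on multipath can be seen in a toy example. The sketch below assumes a known, real-valued multipath channel with a few taps (the tap values are made up) and filters it with its own time-reversed copy; the helper names are hypothetical and the code does not reproduce the paper's OFDM system or its self-information cancelation module.

```python
from typing import List, Sequence


def convolve(a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Full discrete convolution of two real sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def time_reversal_response(channel: Sequence[float]) -> List[float]:
    """Effective response after filtering with the time-reversed channel.

    For real taps this is the channel autocorrelation: the multipath
    energy adds up coherently in the centre tap. Complex taps would also
    need a conjugate, which is omitted here.
    """
    return convolve(channel, list(reversed(channel)))


# Assumed three-path channel: a direct path and two weaker echoes.
channel = [1.0, 0.5, 0.25]
effective = time_reversal_response(channel)
peak_index = max(range(len(effective)), key=lambda n: effective[n])
```

    The centre tap of `effective` equals the total channel energy, while every other tap is strictly smaller, which is the focusing property that makes the residual intersymbol interference easier to handle.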

    Subclass-balancing Contrastive Learning for Long-tailed Recognition

    Long-tailed recognition with imbalanced class distribution naturally emerges in practical machine learning applications. Existing methods such as data reweighing, resampling, and supervised contrastive learning enforce the class balance at the price of introducing imbalance between instances of head classes and tail classes, which may ignore the underlying rich semantic substructures of the former and exaggerate the biases in the latter. We overcome these drawbacks with a novel "subclass-balancing contrastive learning (SBCL)" approach that clusters each head class into multiple subclasses of similar sizes as the tail classes and enforces representations to capture the two-layer class hierarchy between the original classes and their subclasses. Since the clustering is conducted in the representation space and updated during the course of training, the subclass labels preserve the semantic substructures of head classes. Meanwhile, it does not overemphasize tail class samples, so each individual instance contributes to the representation learning equally. Hence, our method achieves both instance- and subclass-balance, while the original class labels are also learned through contrastive learning among subclasses from different classes. We evaluate SBCL over a list of long-tailed benchmark datasets and it achieves state-of-the-art performance. In addition, we present extensive analyses and ablation studies of SBCL to verify its advantages.
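
    The idea of splitting head classes into tail-sized subclasses can be sketched without any learning. The example below assumes each sample has a single scalar feature and uses a sort-and-chunk split as a stand-in for clustering in the representation space; `assign_subclasses` and `target_size` are hypothetical names, and the abstract does not specify the clustering algorithm SBCL actually uses.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def assign_subclasses(
    features: Sequence[float], labels: Sequence[int], target_size: int
) -> List[Tuple[int, int]]:
    """Give every sample a (class, subclass) label.

    Each class is split into roughly len(class) // target_size groups of
    near-equal size, so head classes receive several subclasses while
    classes at or below target_size keep a single one.
    """
    if target_size < 1:
        raise ValueError("target_size must be positive")

    by_class: Dict[int, List[int]] = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    result: List[Tuple[int, int]] = [(-1, -1)] * len(labels)
    for label, indices in by_class.items():
        n_sub = max(1, len(indices) // target_size)
        # Stand-in for clustering: neighbours along the feature axis
        # end up in the same subclass.
        ordered = sorted(indices, key=lambda i: features[i])
        for rank, idx in enumerate(ordered):
            result[idx] = (label, rank * n_sub // len(ordered))
    return result


# Assumed toy data: class 0 is a head class (6 samples), class 1 a tail class (2).
toy_features = [0.1, 0.9, 0.2, 0.8, 0.5, 0.4, 3.0, 3.1]
toy_labels = [0, 0, 0, 0, 0, 0, 1, 1]
toy_subclasses = assign_subclasses(toy_features, toy_labels, target_size=2)
```

    In this toy case the head class is split into three subclasses of two samples each, matching the tail class size, so a contrastive loss defined over (class, subclass) pairs would see groups of equal size. In the method described above the assignment would be recomputed as the representation changes during training.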